Graphs

Introduction

Graph is another important non-linear data structure. In Chapter 7, we have discussed tree structure, which is in fact, a special kind of graph structure. In tree structure, there is a hierarchical relationship between parent and children, that is, one parent and many children. On the other hand, in graph, relationship is less restricted. Here relationship is from many parents to many children. Figure11.1 represents the two non-linear data structures.

Fig. 11.1 Two non-linear data structures: tree and graph.

Graph structures can be seen everywhere in the real world. Some examples are given below so as to understand this structure better.

Airlines

Cities are connected through airlines. This can be represented through graph structure which is shown in Figure 11.2(a). Here, airports are represented by solid dots and airlines by lines between the dots.

Source–destination network

Figure 11.2(b) represents a network connection of three commodities: electricity, gas and water among three distant destinations D1, D2 and D3.

Konigsberg’s bridges

In Eastern Prussia, there is a city Konigsberg. This city is surrounded by four lands A, B, C and D which are divided by the river Pregal. Seven bridges are there to connect four lands. This geographical description can be easily represented through a graph structure which is shown in Figure 11.2(c). A classical problem is related with these Konigsberg’s bridges: to determine, whether it is possible to cross all the bridges exactly once and return to the starting land area. This problem is known as the Konigsberg’s bridge problem.

Flowchart of a program

Flowchart of a program is in fact the graphical representation of an algorithm of a problem. Figure 11.2(d) represents a flowchart.

Similarly, almost in all application areas, graph structures exist.

Fig. 11.2 Continued.

(d) Flowchart – a graphical representation of program

Fig. 11.2 Some examples of graph structures.

Graph Terminologies

In this section, we shall discuss the important terminologies associated with the theory of graphs. In fact, there is no standard terminologies in graph theory. Readers may follow the terminologies mentioned in this section for subsequent discussions. Various terminologies discussed in this section are illustrated in Figure 11.3.

Graph. A graph G consists of two sets:

(i) A set V, called set of all vertices (or nodes)

(ii) A set E, called set of all edges (or arcs). This set E is the set of pair of elements from V.

For example, let us consider the graph G1 in Figure 11.3. Here

V = {v₁, v₂, v₃, v₄}

E = {(v₁, v₂), (v₁, v₃), (v₁, v₄), (v₂, v₃), (v₃, v₄)}

Digraph. A digraph is also called a directed graph. It is a graph G, such that, G = <V, E>, where V is the set of all vertices and E is the set of ordered pair of elements from V. For example, graph G2 is a digraph, where

V = {v₁, v₂, v₃, v₄}

E = {(v₁, v₂), (v₁, v₃), (v₂, v₃), (v₃, v₄), (v₄, v₁)}

Here, if an order pair (v_i, v_j) is in E then there is an edge directed from v_i to v_j (indicated by arrow sign).

Note that, in case of undirected graph (simple graph), pair (v_i, v_j) is unordered, that is,
(v_i, v_j) and (v_j, v_i) are the same edges, but in case of digraph they correspond to two different edges.

Fig. 11.3 Various graphs.

In our subsequent discussion, an undirected graph will be simply termed as graph and directed graph will be termed as digraph.

Weighted graph. A graph (or digraph) is termed as weighted graph if all the edges in it are labeled with some weights. For example, G3 and G4 are two weighted graphs in Figure 11.3.

Adjacent vertices. A vertex v_i is adjacent to (or neighbour of) another vertex say, v_j, if there is an edge from v_i to v_j. For example, in G11 (Figure 11.3), v₂ is adjacent to v₃ and v₄, v₁ is not adjacent to v₄ but to v₂. Similarly the neighbour of v₃ in graph G2 are v₁ and v₂ (but not v₄).

Self loop. If there is an edge whose starting and end vertices are same, that is, (v_i, v_i) is an edge, then it is called a self loop (or simply, a loop). As an example, the graph G5 (in Figure 11.3) has a self loop at vertex v₄.

Parallel edges. If there are more than one edges between the same pair of vertices, then they are known as the parallel edges. For an example, there are two parallel edges between v₁ and v₂ in graph G5 of Figure 11.3. A graph which has either self loop or parallel edges or both is called multigraph. In Figure 11.3, G5 is thus a multigraph.

Simple graph (digraph). A graph (digraph) if it does not have any self loop or parallel edges is called a simple graph (digraph). In Figure 11.3, all graphs (digraphs) except G5 are simple.

Complete graph. A graph (digraph) G is said to be complete if each vertex v_i is adjacent to every other vertex v_j in G. In other words, there are edges from any vertex to all other vertices. For examples, G6 and G9 are two complete graphs.

Acyclic graph. If there is a path containing one or more edges which starts from a vertex v_i and terminates into the same vertex then the path is known as a cycle. For example there is a cycle in both G1 and G2 (Figure 11.3). If a graph (digraph) does not have any cycle then it is called acyclic graph. For examples, in Figure 11.3, G4, G7 are two acyclic graphs.

Isolated vertex. A vertex is isolated if there is no edge connected from any other vertex to the vertex. For example, in G8 (Figure 11.3) the vertex u is an isolated vertex.

Degree of vertex. The number of edges connected with vertex v_i is called the degree of vertex v_i and is denoted by degree(v_i). For example, degree(v_i) = 3, Ú v_i Î G6 (Figure 11.3).

But for a digraph, there are two degrees: indegree and outdegree. Indegree of v_i denoted as indegree(v_i) = number of edges incident into v_i. Similarly, outdegree(v_i) = number of edges emanating from v_i. For example, let us consider the digraph G4. Here:

indegree (v₁) = 2, outdegree (v₁) = 1

indegree (v₂) = 2 outdegree (v₂) = 0

indegree (v₃) = 1 outdegree (v₃) = 2

indegree (v₄) = 0 outdegree (v₄) = 2

Pendant vertex. A vertex v_i is pendant if its indegree(v_i) = 1 and outdegree(v_i) = 0. For example, in G8 v¢ is a pendant vertex. In G7, there are four pendant vertices v₄, v₅, v₆ and v₇.

Connected graph. In a graph (not digraph) G, two vertices v_i and v_j are said to be connected if there is a path in G from v_i to v_j (or v_j to v_i). A Graph is said to be connected if for every pair of distinct vertices v_i, v_j in G, there is a path. For example, G1, G3 and G6 are connected graphs but G8 is not (Figure 11.3).

A digraph G is said to be strongly connected if for every pair of distinct vertices v_i, v_j in G, there is a directed path from v_i to v_j and also from v_j to v_i. For example, digraph G11 is strongly connected but digraphs G10 and G12 are not.

If a digraph is not strongly connected but the underlying graph (without direction of the edges) is connected, then the graph is said to be weakly connected.

We have discussed many terms associated in the theory of graph. As the theory is vast, we will limit our discussions only with four varieties of graphs: directed, undirected, weighted and non-weighted graphs.

Representation of Graphs

A graph can be represented in many ways. Some of these representations are as under:

1. Set representation

2. Linked representation

3. Sequential (matrix) representation.

To illustrate the above representation, let us assume the graph structures as shown in Figure 11.5.

Fig. 11.5 Types of graphs.

Set Representation

This is one of the straightforward methods of representing a graph. With this method, two sets are maintained: (i) V, the set of vertices, (ii) E, the set of edges, which is the subset of V ´ V. But if the graph is weighted, the set E is the ordered collection of 3 tuples, that is, E = W ´ V ´ V, where W is the set of weights.

Let us see, how all the graphs as given in Figure 11.5 can be represented with this technique.

Graph G1

V(G1) = {v₁, v₂, v₃, v₄, v₅, v₆, v₇}

E(G1) = {(v₁, v₂), (v₁, v₃), (v₂, v₄), (v₂, v₅), (v₃, v₆), (v₃, v₇)}

Graph G2

V(G2) = {v₁, v₂, v₃, v₄, v₅, v₆, v₇}

E(G2) = {(v₁, v₂), (v₁, v₃), (v₂, v₄), (v₂, v₅), (v₃, v₄), (v₃, v₆), (v₄, v₇), (v₅, v₇), (v₆, v₇)}

Graph G3

V(G3) = {A, B, C, D, E}

E(G3) = {(A, B), (A, C), (C, B), (C, A), (D, A), (D, B), (D, C), (D, E), (E, B)}

Graph G4

V(G4) = {A, B, C, D}

E(G4) = {(3, A, C), (5, B, A), (1, B, C), (7, B, D), (2, C, A), (4, C, D), (6, D, B), (8, D, C)}

Note that, if the graph is a multigraph and undirected, this method does not allow to store parallel edges, as in a set, two identical elements cannot exist. Although it is a straightforward representation, and from the memory point of view it is most efficient, but so far manipulation of graph is concerned this method of representation is not useful.

Linked Representation

Linked representation is another space-saving way of graph representation. In this representation, two types of node structures is assumed as shown in Figure 11.6.

Fig. 11.6 Node structures in linked representation.

Now, let us see, how the four graphs in Figure 11.5 can be represented with the linked representation. The linked representations of graphs in Figure 11.5 is shown in Figure 11.7.

Fig. 11.7 Continued.

Fig. 11.7 Linked representation of graphs as given in Figure 11.5.

Observe that, in the linked representation of graphs, the number of lists depends on the number of vertices in the graph. The header node in each list maintains a list of all adjacent vertices of a node for which header node is meant.

Matrix Representation

Matrix representation is the most useful way of representing any graph. This representation uses a square matrix of order n ´ n, n being the number of vertices in the graph. A general representation is as shown in Figure 11.8.

Fig. 11.8 Matrix representation of graph.

Entries in the matrix can be decided as follows:

a_ij = 1, if there is an edge from v_i to v_j

= 0, otherwise

This matrix is known as adjacency matrix because an entry stores the information whether two vertices are adjacent or not. Also, the matrix is alternatively termed as bit matrix or Boolean matrix as the entries are either 0 or 1.

Adjacency matrix is also useful to store multigraph as well as weighted graph. In case of multigraph representation, instead of entry 1, the entry will be the number of edges between two vertices. And in case of weighted graph, the entries are weights of the edges between the vertices instead of 0 or 1. Figure 11.9 shows the adjacency matrix representation of the graphs G1, G2, G3 and G4 as given in Figure 11.5.

Fig. 11.9 Adjacency matrix representation of graphs as given in Figure 8.5.

Comparison among various representations

The adjacency matrix representation has a disadvantage that it always requires n ´ n matrix with n vertices, regardless of the number of edges. If the graph is very sparse, majority of the entries are null. But from the manipulation point of view, this representation is the best.

Set representation of graph is very concise but manipulation with this representation has a lot of difficulties. In fact, this kind of representation has only theoretical interest. So far insertion, deletion, searching and merging are concerned, in some cases, linked representation of graph has an advantage, but considering overall performance, matrix representation of graph is more powerful than all. This is why, in our subsequent discussions, we will emphasize on matrix representation of graphs.

Advantages of adjacency matrix representation of graphs

In this section, we are to elaborate how some extra information can be retrieved easily from a graph if it is represented with an adjacency matrix.

Lemma 11.1

If A be an adjacency matrix of a graph G, and if A = A^T, where A^T is the transpose matrix of A, then G is a simple undirected graph.

Proof : Let a_ij denotes an entry in the i-th row and j-th column of the matrix A. Now, in case of a simple (no loop, no parallel edges) undirected graph, a_ij = 1(0) implies that there is a path (no path) from v_i to v_j then there is also path (no path) from v_j to v_i, that is, a_ji = 1(0). Thus, for a simple undirected graph a_ij = a_ji. Thus, the matrix is symmetric. Since, for any symmetric matrix A = A^T, the result follows.

Corollary 1: For a simple digraph, A ¹ A^T.

Example 11.1 Figure 8.11 shows two simple graphs and their adjacency matrices. For a simple undirected graph, A = A^T; and for a simple directed graph, A ¹ A^T.

Fig. 11.11 Illustration of Lemma 8.1.

Corollary 2: If the graph is simple then all the diagonal entries are 0.

Lemma 11.2

Suppose G is a digraph and A be its adjacency matrix. Then the diagonal entries of A - A^T gives the outdegrees of all vertices in G, and the diagonal entries of A^T-A gives the indegrees of all vertices in G.

Proof: Let us first consider the case of A - A^T. Let B = A - A^T and suppose b_ij be an element in the i-th row and j-th column of B. In general,

where i, j = 1, 2, ..., n. The value of a_ik × a_jk = 1 if and only if both a_ik and a_jk are 1; otherwise a_ik × a_jk = 0. Now, a_ik = 1 if (v_i, v_k) Î E and a_jk = 1 if (v_j, v_k) Î E. If both (v_i, v_k) and (v_j, v_k) are the edges of the graph for some fixed k, then we get a contribution of 1 in the summation expressing b_ij. Therefore, the values of b_ij shows the number of vertices which are connected both from v_i and v_j.

Now, if i = j, then

and = 1 if a_ik = 1, that is, if (v_i, v_k) Î E, or in other words, there is an edge directed from v_i to v_k. Thus, total contribution in summation expression gives the total outgoing edges from v_i. Therefore, the diagonal entries of A × A^T implies the outdegrees of the vertices. Similar conclusion can be drawn for A^T × A but for indegrees and proof of which is left as an exercise.

Example 11.2 Figure 11.12 shows a digraph with its adjacency matrix. A - A^T and A^T - A are computed as shown.

Fig. 11.12 Computation of indegrees and outdegrees of all vertices in a digraph.

Lemma 11.3

Let A be an adjacency matrix of a digraph. A^L = , gives the number of paths of length L from v_i to v_j, where A^L is the L-th power matrix of A.

Proof Naturally an entry of 1 in the i-th row and j-th column of A shows the existence of an edge (v_i, v_j), that is, a path of length 1 from v_i to v_j. Let us denote the elements of A² by a²_ij. Then

For any value of k, a_ik × a_ij = 1 if both a_ik and a_kj equal 1, that is, iff (v_i, v_k) and (v_k, v_j) are the edges of the graph. For each such k, we get a contribution of 1 in the summation expression. Now, (v_i, v_k) and (v_k, v_j) imply that there is a path from v_i to v_j of length 2. Therefore, a²_ij = 1 means a path of exactly length 2 from v_i to v_j. By a similar argument, one can show that the element in the i-th row and j-th column of A³ gives the number of paths of exactly length 3 from v_i to v_j. Therefore, it follows the proof.

Example 11.3 For the digraph G in Figure 8.13, A², A³ and A⁴ are computed as shown:

Fig. 11.13 Computation of number of paths in a digraph.

As illustrated in the figure, in case of A³, = 1, that is there is only one path of length 3 from vertex 2 to vertex 3 which is [2-3-2-3], on the other hand, from A⁴ it is observed that = 2, that is, there are 2 different paths of length 4, such as [2-4-1-2-3], [2-3-1-2-3].

Note: In the above computation, A^L not necessarily gives the count of elementary paths (the path which does not cross one vertex more than once).

Path matrix

Before going to the next important Lemma, let us introduce the concept of path matrix. Let G be a simple graph with n vertices v₁, v₂, ..., v_n. An n ´ n matrix P = [p_ij] whose entries are defined as:

p_ij = 1, if there exists a path from v_i to v_j

= 0, otherwise

then P is called the path matrix or reachability matrix of the graph G.

Let us discuss how path matrix P of a given graph G can be obtained from its adjacency matrix A. From the adjacency matrix of A we can immediately determine whether there exists an edge from v_i to v_j. Also from the matrix A^L, where L is some positive integer, we can establish the number of paths of length L from v_i to v_j. Add the matrices A, A², A³, ..., A^L so that

B^L = A + A² + A³ + . . . + A^L

then from the matrix B^L, we can determine the number of paths of length less than or equal to L from any v_i to v_j. If we wish to determine whether v_j is reachable from v_i, it would be necessary to find whether there exists a path of any length from v_i to v_j.

Lemma 11.4

In a simple digraph, the length of any elementary path is less than or equal to
n – 1, where n is the number of vertices in the graph. Similarly, the length of any elementary cycle does not exceed n.

Proof The proof is based on the fact that in any elementary path the node appearing in the sequence are distinct. The number of distinct nodes in an elementary path of length l is
l + 1, since there are only n distinct nodes in the graph, we cannot have an elementary path of length greater than n – 1.

For an elementary cycles of length l, the sequence contains l distinct nodes hence the result.

In a graph, all the elementary cycles or path can be determined from the matrix B_n where

B_n = A + A² + A³ + . . . + Aⁿ

n being the number of vertices in the graph and A being the adjacency matrix of the graph.

For an example, B₄ of the graph as presented in Figure 8.13, can be obtained as:

The element in the i-th row and j-th column of B₄ shows the number of paths having length 4 or less than those exist between v_i to v_j in the graph.

From the matrix B_n, we can obtain the path matrix of the graph as follows: Let P = [p_ij] be a path matrix. Then p_ij = 1 iff there is a non-zero element in the i, j entry of the matrix B_nelse p_ij = 0. Thus, the path matrix of the running example is shown above as P. [Note: The path matrix only shows the presence or absence of at least one path between a pair of points and also the presence or absence of a cycle at any node.]

Lemma 11.5

(a) A vertices v_i contains a cycle if the i, i entry in the path matrix P is 1.

(b) A graph is strongly connected if for all v_i, v_j Î G, both the i, j entry and j, i entry in the path

matrix are 1.

Proof Proof can be easily followed from the previous discussion, and left as an exercise.

As an example, for the graph as given in Figure 8.13, and whose path matrix P is obtained as above, we can conclude that, it is strongly connected and all the vertices have cycle.

Lemma 11.6

Let P be a path matrix of a graph G. If P^T is the transpose of the matrix P, then the i-th row of the matrix P * P^T, [which is obtained by the element-wise product (AND) of the elements] gives the strong component containing v_i.

Proof If v_j is reachable from v_i then clearly p_ij = 1; also if v_i is reachable from v_j, then p_ji = 1. Therefore the element in the i-th row and j-th column of P * P^T is 1 iff v_i and v_j are mutually reachable. This is true for all j. Hence the result.

Example 11.4 As an example, consider the graph as given in Figure 11.14. Its path matrices P and P * P^T are calculated as shown.

Fig. 11.14 Path matrices and strong component of G.

From P * P^T, strong component is the sub-graph containing the vertices 2, 3 and 4, that is, any vertex is reachable from all these vertices.

OPERATIONS ON GRAPHS

The important operations possible on graphs are listed below:

Insertion

(a) To insert a vertex and hence establishing connectivity with other vertices in the existing graph.

(b) To insert an edge between two vertices in the graph.

Deletion

(a) To delete a vertex from the graph.

(b) To delete an edge from the graph.

Merging

To merge two graphs G1 and G2 into a single graph.

Traversal

To visit all the vertices in the graph.

The implementation of the above operations depends on the way a graph is being represented. We will consider only the linked representation and matrix representation of the graph. We will discuss about the implementation of the above mentioned operations, when the graph is represented with linked list. Further, the same will be implemented when the graph is represented using matrix.

Operations on Linked List Representation of Graphs

In order to generalize the implementation, we will assume two data structures in this representation. One is array of vertices and other is a single linked list. The two structures are shown in
Figure 11.16. Here, the first data structure, is an array of vertices having two fields: LABEL-label for the vertices (we will assume this field of integer type) and LINK-the pointer to a linked list. The second data structure, namely, a linked list, to maintain the list of all adjacent vertices for any vertex v_i for which it is meant. So, if n vertices are there in the graph, then n linked lists have to be maintained. The node structure in the linked list may be noted. A node structure has two fields other than the field LINK. The first field WEIGHT is to store the weight of the edge and the second field LABEL to store the vertex’s label. Thus, this structure is sufficient to store any kind of graphs—directed, undirected, weighted or non-weighted graph.

Fig. 11.16 Data structures for linked list representation of graphs.

With this preliminary, let us consider the various operations on graphs.

Insertion

If we insert a vertex into a graph, the different steps involved are illustrated as in Figure 11.17. From Figure 11.17, it is evident that the insertion procedure differs for undirected graph and directed graph.

Fig. 11.17 Insertion of a vertex into (a) an undirected graph and (b) a digraph.

In case of insertion of a vertex into an undirected graph, if v_x is inserted and v_i be its adjacent vertex then v_i has to be incorporated in the adjacency list of v_x as well as v_x has to be incorporated in the adjacency list of v_i.

On the other hand, if it is a digraph and if there is a path from v_x to v_i, then we add a node for v_i into the adjacency list of v_x; if there is an edge from v_i to v_x, add a node for v_x in the adjacency list of v_i.

Following Algorithms InsertVertex_LL_UG is to insert a vertex into a graph and InsertVertex_LL_DG is to insert a vertex into a digraph.

Algorithm InsertVertex_LL_UG

Input: V_x, the new vertex that has to be inserted with

X = [v_x₁, v_x₂, . . ., v_x_l], l number of adjacent vertices to the vertex V_x.

Let N = Number of vertices currently present in the graph.

Output: A graph with new vertex v_x and its adjacent edges v_xi, i = 1, 2, ..., l, if adjacent vertices exist.

Data structure: Linked structure of undirected graph and UGPtr is the pointer to it.

Steps:

1.	N = N + 1, V_x = N // Number of vertices is increased by 1 and label of the new vertex is N
	/ To add the adjacency list of the new vertex, V_x in the graph /
2.	For i = 1 to l do
3.		Let j = X[i] // j is the label of i-th adjacent vertex
4.		If (j ³ N) then // This label is not in graph
5.			Print “No vertex labeled X[i] exist: Edge from V_x to X[i] is not established”.
6.		Else
7.			InsertEnd_SL (UGptr[N], X[i]) // Insert X[i] into the list of vertices
8.			InsertEnd_SL (UGptr[j],V_x) // To establish edges from X[i] to V_x
9.		EndIf
10.	EndFor
11.	Stop

Now, let us describe the algorithm InsertVertex_LL_DG to insert a vertex into a digraph.

Algorithm InsertVertex_LL_DG

Input: V_x, the new vertex that has to be inserted.

X = [v_x₁, v_x₂, ..., v_xm], the list of adjacent vertices that has edges from V_x to v_xi,

i = 1, 2, ..., m.

Y = [v_y₁, v_y₂, ..., v_yn], the list of adjacent vertex that has edges from v_yi (i = 1, 2, ..., n) to V_x.

Let N be the number of vertices currently present in the graph.

Output: A graph with new vertex V_x and directed edges from V_x to v_xi, i = 1, 2, ..., and from v_yi (i = 1, 2, ...,

n) to V_x, if such v_xi and v_yi exist.

Data structure: Linked structure of undirected graph and DGptr is the pointer to it.

Steps:

1.	N = N + 1, V_x = N // New vertex V_x is counted as next numbered in the graph
	/ To set the edge from V_x to v_xi, i = 1, 2, ..., m /
2.	For i = 1 to m do
3.		Let j = X[i] // j is the label of i-th adjacent vertex of V_x
4.		If j ³ N then // j does not exist in the graph
5.			Print “No vertex labelled X[i] does not exist: Edge from V_x to X[i] is not established”.
6.		Else // Set the edge
7.			InsertEnd_SL(DGptr[N], X[i])
8.		EndIf
9.	EndFor
	/ To set edge from v_yi, i = 1, 2, ..., n to V_x/
10.	For i = 1 to n do
11.		Let j = Y[i] // j is the label of i-th adjacent vertex
12.		If j ³ N then // j does not exist in the graph
13.			Print “No vertex labelled Y[i] does not exist: Edge from Y[i] to V_x is not established”
14.		Else // Set the edge
15.			InsertEnd_SL(DGptr[j],V_x)
16.		EndIf
17.	EndFor
18.	Stop

In the above two algorithms, we have assumed the procedure InsertEnd_SL(. . . ); this procedure is already discussed in Section 3.2.2. It is also assumed that the size of the graph is unlimited and the labels of the vertices are all integer numbers.

So far we have discussed the case of insertion of a vertex into a graph. Now, let us discuss the case of adding an edge between v_i and v_j in a graph G. If the graph G is an undirected graph then adding an edge means inclusion of vertex v_j in the adjacency list of v_i and also inclusion of vertex v_i in the adjacency list of v_j. On the other hand, if G is the digraph and if we want to add an edge from v_i to v_j, then we have to include the vertex v_j into the adjacency list of v_i. Following are the two algorithms InsertEdge_LL_UG and InsertEdge_LL_DG to insert an edge in undirected and digraph, respectively.

Algorithm InsertEdge_LL_UG

Input: <V_i, V_j>, the edge to be inserted between vertices v_i and v_j.

Output: The graph with edge added between v_i and v_j.

Data structure: Linked structure of undirected graph and UGptr is the pointer to it.

Steps:

1.	Let N = number of vertices in the graph
2.	If (V_i > N) or (V_j > N) then
3.		Print “Edge is not possible between V_i and V_j”
4.	Else
5.		InsertEnd_SL(UGptr[V_i], V_j) // Add V_j in the adjacency list of V_i
6.		InsertEnd_SL(UGptr[V_j], V_i) // Add V_i in the adjacency list of V_j
7.	EndIf
8.	Stop

Now, let us describe the algorithm InsertEdge_LL_DG to insert an edge into a digraph.

Algorithm InsertEdge_LL_DG

Input: <V_i, V_j>, the edge to be inserted from vertices V_i and V_j.

Output: The graph with edge added between V_i and V_j.

Data structure: Linked structure of undirected graph and DGptr is the pointer to it.

Steps:

1.	Let N = number of vertices in the graph
2.	If (V_i > N) or (V_j > N) then
3.		Print “Edge is not possible between V_i and V_j”
4.	Else
5.		InsertEnd_SL(DGptr[V_i], V_j) // Add V_j in the adjacency list of V_i
6.	EndIf
7.	Stop

Deletion

This operation is again different for undirected graph and directed graph (digraph). Let us first consider the case of deletion of a vertex in an undirected graph.

Refer Figure 11.17(a). Suppose we want to delete the vertex v₈ from the graph. To do this, first we have to look for the adjacency list of v₈. All the vertices which are present in the adjacency list of v₈, the node labelled v₈ has to be deleted from the adjacency lists of those vertices. For example, in the adjacency list of v₈, two vertices namely v₁ and v₄ are present. So, we have to delete the node labelled v₈ from the adjacency list of v₁ and v₄.

The above operation is formulated in the algorithm DeleteVertex_LL_UG.

Algorithm DeleteVertex_LL_UG

Input: V_x, the label of the vertex which has to be removed from the graph.

Let N be the number of vertices presently available in the graph.

Output: The reduced graph without the vertex V_x and its associated edges.

Data structure: Linked structure of undirected graph and UGptr is the pointer to it.

Steps:

1.	If (N = 0) then
2.		Print “Graph is empty : No deletion”
3.		Exit
4.	EndIf
5.	ptr = UGptr[V_x]→LINK // Move to the first node in the list of V_x
6.	While (ptr ¹ NULL) do // For all nodes in the adjacency list
7.		j = ptr→LABEL //Get the label of the vertex
8.		DeleteAny_SL(UGptr[j], V_x) // Delete the node having label V_x from adjacency list
9.		DeleteAny_SL(UGptr[V_x], j) // Delete the node having label ‘j’ from adjacency list of V_x
10.		ptr = UGptr[V_x] →LINK // Next node in adjacency list of V_x
11.	EndWhile
13.	UGptr[V_x] →LABEL = NULL // Make the entry NULL
14.	UGptr[V_x] →Link = NULL
15.	ReturnNode (ptr)
16.	N = N – 1
17.	Stop

The algorithm DeleteVertex_LL_UG uses the procedure DeleteAny_SL(...), which is already defined in
Module 5 (Linked Lists).

Now, let us discuss the case of deleting a vertex from a digraph. This procedure is a little bit complex. Let us refer to Figure 11.17(b). Suppose, we want to delete the vertex v₈ from the digraph as shown in Figure 11.17(b). To do so, we should dispose off whole the adjacency list of v₈. This removes all the edges which are emanated from v₈ (namely, v₁ in the graph). Next, we have to search the adjacency list for all vertices, if there is any vertex v which has an edge from v to v₈. For example, there is the vertex v₄ only which has an edge from v₄ to v₈. So, v₈ should be removed from the adjacency list of v₄. The above operation is generalized as the algorithm DeleteVertex_LL_DG for deleting any vertex from a digraph.

Algorithm DeleteVertex_LL_DG

Input: DGptr, the pointer to the graph. V_x, the label of the vertex which has to be removed from the graph.

Let N be the number of vertices presently available in the graph.

Output: The reduced graph without the vertex V_x and its associated edges.

Data structure: Linked structure of undirected graph and DGptr is the pointer to it.

Steps:

1.	If (N = 0) then
2.		Print “Graph is empty: No deletion”
3.		Exit // Exit the program
4.	EndIf
5.	ptr = DGptr[V_x]→LINK // Pointer to the adjacency list of V_x
6.	DGptr[V_x] →LINK = NULL // Remove adjacency list of vertex V_x
7.	DGptr[V_x] →Label = NULL // Remove V_x from the list of vertices
8.	N = N – 1
9.	ReturnNode(ptr) // Return the block to memory manager
	/ For V_x in the adjacency list, deleting all existing vertices /
10.	For i = 1 to N do
11.		DeleteAny_SL(DGptr[i], V_x)
12.	EndFor
13.	Stop

After the discussion of deleting a vertex from a graph, now we shall discuss how to delete an existing edge from a graph.

Suppose, we want to delete an edge between the two nodes v_i and v_j in an undirected graph. This means that we have to delete the node having label v_j in the adjacency list of v_i as well as the node having label v_i in the adjacency list of v_j. In contrast, if this edge is in the digraph and the edge is directed from v_i to v_j then we have to delete the node labelled as v_j in the adjacency list of v_i only. These two cases are described separately in the algorithms DeleteEdge_LL_UG and DeleteEdge_LL_DG as stated below:

Algorithm DeleteEdge_LL_UG

Input: UGptr, the pointer to the graph. <V_i, V_j>, the edge to be deleted between vertices V_i and V_j.

Output: The graph without edge between V_i and V_j.

Steps:

1.	Let N = number of vertices in the graph
2.	If (V_i > N) or (V_j > N) then
3.		Print “Vertex does not exist: Error in edge removal”
4.	Else // Delete V_i and V_j from the adjacency list of V_j and V_i, respectively
5.		DeleteAny_SL (UGptr[V_i], V_j) // Delete V_j from the adjacency list of V_i
6.		DeleteAny_SL(UGptr[V_j],V_i) // Delete V_i from the adjacency list of V_j
7.	EndIf
8.	Stop

Now, let us describe the algorithm DeleteEdge_LL_DG to delete an edge from a digraph.

Algorithm DeleteEdge_LL_DG

Input: DGptr, the pointer to the graph. <V_i, V_j>, the edge to be deleted from vertex V_i to V_j.

Output: The graph without edge from vertex V_i to V_j.

Steps:

1.	Let N = number of vertices in the graph
2.	If (V_i > N) or (V_j > N) then
3.		Print “Vertex does not exist: Error in edge removal”
4.	Else
5.		DeleteAny_SL (DGptr[V_i], V_j) // Delete V_j from the adjacency list of V_i
6.	EndIf
7.	Stop

Graph traversals

Traversing a graph means visiting all the vertices in the graph exactly once. For the sake of simplicity, we will assume that the graph is connected.

Several methods are known to traverse a graph systematically, out of them two methods are accepted as standard and will be discussed in details in this section. These methods are called breadth first search (BFS) and depth first search (DFS). With these traversals, starting from a given node we can visit all the nodes which are reachable from that starting node.

Depth first search (DFS) traversal is similar to the inorder traversal of a binary tree. Starting from a given node, this traversal visits all the nodes up to the deepest level and so on.

Figure 11.18 shows the DFS traversals on two graphs G1 and G2 starting from the vertex v₁. The path of traversals are indicated with thick lines. The sequence of visiting of the vertices can be obtained as:

DFS(G1) = v₁-v₂-v₅-v₇-v₄-v₈-v₆-v₃

DFS(G2) = v₁-v₂-v₅-v₇-v₄-v₈-v₃-v₆

From the above traversals, the traversals take place up to the deepest level, for example,
v₁-v₂ -v₅-v₇, then v₄-v₈ and v₆-v₃ (in G1), v₃-v₆ (in G2). It can be noted that the sequence of visit depends on the depth we choose first and hence the order of visiting the vertices may not be unique.

Fig. 11.18 DFS traversal on two graphs G1 (undirected) and G2 (directed).

Another standard graph traversal method is the breadth first search (BFS). This traversal is very similar to the level-by-level traversal of a tree. Here, any vertex in the level i will be visited only after the visit of all the vertices in its preceding level, that is, at i – 1. BFS traversal is roughly analogous to the preorder traversal of a tree. Here, suppose we are to visit the vertex v_i and v_i has v_i₁, v_i₂, ..., v_in as the adjacent vertices of it. Then the BFS traversal can be defined recursively as stated below:

Traverse(v_i)

Process(v_i)

Traverse(v_i₁)

Traverse(v_i₂)

. . .

Traverse(v_in)

End of traversal (v_i)

The BFS traversals on two graphs G1 and G2 starting from the vertex v₁ are shown as in
Figure 11.19.

From Figure 8.19 the following order of visit of vertices with the BFS traversals can easily be revealed:

BFS(G1) = v₁-v₂-v₈-v₃-v₅-v₄-v₆-v₇

BFS(G2) = v₁-v₂-v₃-v₅-v₄-v₆-v₇-v₈

It can be noted that, in G1 (undirected graph), v₁ is in the first level and visited first, then v₂, v₃ and v₈ are visited which are in the same level and next to v₁; similarly v₄, v₅ and v₆ and so on. In G2 (digraph), v₂ and v₃ are in same level, v₄, v₅ and v₆ are in one level; again v₇, v₈ are in one level.

Fig. 11.19 BFS traversal on two graphs G1 (undirected) and G2 (directed).

Notes:

1. DFS and BFS traversals result an acyclic graph.

2. DFS traversal and BFS traversal on the same graph do not give the same order of visit of vertices, or more precisely they generally result in different acyclic graphs.

3. DFS or BFS traversals can be employed to decide whether there is a path from a given vertex v_i to another vertex v_j and if it exists then traces the path from v_i to v_j.

4. DFS and BFS traversals cannot visit a vertex v_i say, if there is no path from starting vertex to that v_i.

In the next two sections, we will describe the details of the algorithms to implement the above mentioned traversals on any graph.

DFS traversal

It is already pointed that the DFS traversal is similar to the inorder traversal of a tree. The general approach behind DFS traversal beginning at vertex v is that visit the vertex v first. Then visit all the vertices along a path which begins at v. This means that, visit the vertex v then the vertex immediate adjacent to v, let it be v_x. Next, if v_x has an immediate adjacent v_y say, then visit it and so on, till there is a ‘dead end’. This results a path P, v-v_x-v_y ... Dead end means a vertex which do not have immediate adjacent or its immediate adjacent already been visited. After coming to a ‘dead end’ we backtrack along P to v to see if it has another adjacent vertex other than v_xand then continue the same from it else from the adjacent of the adjacent (which is not visited earlier), and so on.

A stack can be used to maintain the track of all paths from any vertex so as to help backtracking. Initially the starting vertex will be PUSHed onto the stack (let the name of the stack be OPEN). To visit a vertex, we are to POP a vertex from OPEN, and then PUSH all the adjacent vertices onto it. A list, VISIT can be maintained to store the vertices already visited. When a vertex is popped, whether it is already visited or not that can be known by searching the list VISIT; if the vertex is already visited, we will simply ignore it and we will POP the stack for the next vertex to be visited. This procedure will be continued till the stack is not empty. The above idea is expressed more precisely as below:

Algorithm DFS (informal description)

1.	Push the starting vertex into the stack OPEN
2.	While OPEN is not empty do
3.		POP a vertex V
4.		If V is not in VISIT
5.			Visit the vertex V
6.			Store V in VISIT
7.			Push all the adjacent vertex of V onto OPEN
8.		EndIf
9.	EndWhile
10.	Stop

The detailed version of the above algorithm, when the given graph is represented using linked lists, is stated as in algorithm DFS_LL.

Algorithm DFS_LL

Input: V, the starting vertex

Output: A list VISIT giving the order of visit of vertices during traversal.

Data structure: Linked structure of graph. GPTR is the pointer to a graph.

Steps:

1.	If Gptr = NULL then
2.		Print “Graph is empty”
3.		Exit
4.	EndIf
5.	u = V // Start from V
6.	OPEN.PUSH(u) // Push the starting vertex into OPEN
7.	While (OPEN.TOP ¹ NULL) do // Till the stack is not empty
8.		u = OPEN.POP( ) // Pop the top element from OPEN
9.		If (Search_SL(VISIT, u) = FALSE) then // If u is not in VISIT
10.			InsertEnd_SL(VISIT, u) // Store u in VISIT
11.			ptr = GPTR[u] // To push all the adjacent vertices of u into OPEN
12.			While (ptr→LINK ¹ NULL) do
13.				vptr = ptr→LINK
14.				OPEN.PUSH (vptr→LABEL)
15.			EndWhile
16.		EndIf
17.	EndWhile
18.	Return(VISIT)
19.	Stop

BFS traversal

Now, let us describe the breadth first search (BFS) on a graph. The implementation idea of the BFS traversal is almost same as the DFS traversal except that in BFS we will use a queue structure instead of a stack structure as in DFS. Let us denote the queue as OPENQ to use it in BFS method. VISIT is the list to store the ordering of vertices during the BFS traversal. The detailed version of the BFS traversal is stated in the algorithm BFS_LL as stated below.

Algorithm BFS_LL

Input: V is the starting vertex.

Output: A list VISIT giving the order of visit of vertices during the traversal.

Data structure: Linked structure of graph. Gptr is the pointer to a graph.

Steps:

1.	If (GPTR = NULL) then
2.		Print “Graph is empty”
3.		Exit
4.	EndIf
5.	u = V
6.	OPEN.ENQUEUE(u) // Enter the starting vertex into OPENQ
7.	While (OPENQ.STATUS( ) ¹ EMPTY) do // Till the OPENQ is not empty
8.		u = OPENQ.DEQUEUE ( ) // Delete the item from OPENQ
9.		If (Search_SL(VISIT, u) = FALSE) then // If u is not in VISIT then visit it
10.			InsertEnd_SL(VISIT, u) // Store u in VISIT
11.			ptr = Gptr[u]
12.			While (ptr→LINK ¹ NULL) do // To enter all the adjacent vertex of v into OPENQ
13.				vptr = ptr→LINK
14.				OPENQ.ENQUEUE(vptr→LABEL)
15.			EndWhile
16.		EndIf
17.	EndWhile
18.	Return (VISIT)
19.	Stop

In the algorithm DFS_LL, we have used stack. Similarly, queue is used in the algorithm BFS_LL. Usual operations on single linked list are discussed in Module 5.

Merging two graphs

Consider two graphs G1 and G2. By merging we can combine these two graphs into a single component. This can be accomplished by establishing one or more edges between the vertices in G1 and G2. Figure 11.21 illustrates the merging operation between two graphs G1 and G2.

Fig. 11.21 Merging of two graphs G1 and G2.

Let us define this operation such that it will merge two graphs G1 and G2 over a given set of edges and return a new graph G. This can be done by copying all the adjacency list from G1 and G2 to the adjacency list of G and then establishing edges among the given vertices. Following is the algorithm for merging two undirected graphs:

Algorithm Merge_UG_LL

Input: UG1ptr, the pointer to a graph G1, UG2ptr, the pointer to another graph G2. S = Set of edges

from G1 to G2. Let N₁, N₂ are the number of vertices in the graphs G1 and G2, respectively.

Output: UGptr, the pointer to the resultant graph after merging.

Data structure: Linked structure of graph.

Steps:

1.	For i = 1 to N₁ do // To copy the adjacency lists of all the vertices in G1 to G
2.		UGptr[i]→LINK = UG1ptr[i] →LINK
3.	EndFor
4.	For i = 1 to N₂ do // To copy the adjacency lists of all the vertices in G2 to G
5.		UGptr[N₁+ i] →LINK = UG2ptr[i] →LINK
6.	EndFor
7.	N = N₁+ N₂// Number of vertices in the resultant graph G
	/ To set the edge(s) from vertex in G1 to vertex in G2 in G /
8.	While (S ¹ NULL) do
9.		Read (v, w) // Read a vertex v in G1 and w in G2 from the set S
10.		If (v < N₁) and (w < N₂) then // To see if these vertices are in G1 and G2 or not
11.			InsertEdge_UG_LL (UGptr, v, N₁+ w) // Call the procedure of edge insertion
12.		EndIf
13.	EndWhile
14.	Return(UGptr) // Return the resultant graph
15.	Stop

Likewise, for merging any two digraphs, the algorithm Merge_DG can be defined. Here, only Step 6 is to be modified. It is left as an exercise.

Operations on Matrix Representation of Graphs

So far we have discussed how to define various operations on graphs if they are maintained using linked list. Here, we have pointed out that matrix representation of graphs is more advantageous in several respects over linked list representation. In this section, we will discuss how the various operations like insertion, deletion, traversal, merging, etc., can be carried out in matrix representation of graph. It can be seen that, in this representation all these operations are computationally faster and more easier to describe also.

The essential data structure that will be assumed in the subsequent discussion is a 2D matrix of size n ´ n in order to store a graph of n vertices. As a matrix is based on static memory allocation, we will assume that the matrix under consideration is too large to store a graph of arbitrary large size n.

Insertion

How to insert a vertex into a graph is illustrated in Figure 11.22. To insert a vertex v₈, say, we have to add one row and one column. If the graph is undirected then transpose of row entry is same as that of the column. But for a digraph, these two entries are not necessarily the same. In this case, the main idea behind the insertion of a new vertex v_i is that if there is an edge from v_i to v_j then we have to enter a 1 in i-th row and j-th column.

Fig. 11.22 Insertion of a vertex into an undirected and a directed graph.

The above strategies are generalized as the algorithm InsertVertex_AM_UG and InsertVertex_AM_DG, to insert a vertex into an undirected graph and directed graph, respectively.

Algorithm InsertVertex_AM_UG

Input: V_x, the vertex that has to be inserted.

X = [V_x₁, V_x₂, ..., V_x_l], is l adjacent vertices to the vertex V_x.

Let N = number of vertices currently present in the graph.

Output: A graph with a new vertex V_x and its adjacent edges with V_x_i, i = 1, 2, ..., l, if adjacent vertices exist.

Data structure: Matrix representation of graph. UGptr, the pointer to the undirected graph.

Steps:

1.	N = N + 1, V_x = N // Number of vertices is increased by l and label of the new vertex is N
2.	For i = 1 to V_x do // Insert a row and a column for V_x
3.		UGptr[V_x][i] = 0 // Row vector is initialized
4.		UGptr[i][V_x] = 0 // Column vector is initialized
5.	EndFor
	/ To add the adjacency list of the new vertex V_x in the graph /
6.	For i = 1 to l do
7.		Let j = X[i] // j is the label of i-th adjacent vertex
8.		If (j ³ N) then
9.			Print “No vertex labelled X[i] does exist. Edge from V_x to X[i] is not established.”
10.		Else
11.			UGptr[V_x][j] = 1 // Add entry from V_x to X[i]
12.			UGptr[j][V_x] = 1 // Add entry from X[i] to V_x
13.		EndIf
14.	EndFor
15.	Stop

Now, let us describe the InsertVertex_AM_DG to insert a vertex and its associated edge(s) into a digraph.

Algorithm InsertVertex_AM_DG

Input: V_x, the new vertex that has to be inserted.

X = [V_x₁, V_x₂, ..., V_xm], the list of adjacent vertices that has edges from V_x to V_xi, i = 1, 2, ..., m.

Y = [V_y₁, V_y₂, ..., V_yn], the list of adjacent vertex that has edges from V_yi (i = 1, 2, ..., n) to V_x.

Let N be the number of vertices currently present in the graph.

Output: A graph with new vertex v_x and directed edges from v_x to v_xi, i = 1, 2, ..., and from v_yi, i = 1, 2, ..., n to V_x, if such V_xi and V_yi exist.

Data structure: Matrix representation of graph. DGptr, the pointer to the directed graph.

Steps:

1.	N = N + 1, V_x = N // New vertex is counted in the graph
	/* To add a row and a column for V_x and initialized it */
2.	For i = 1 to V_x do
3.		DGptr[V_x][i] = 0 // Row vector is initialized
4.		DGptr[i][V_x] = 0 // Column vector is initialized
5.	EndFor
	/To set the edge from V_x to V_xi, i = 1, 2, ..., m/
6.	For i = 1 to m do
7.		Let j = X[i] // j is the label of i-th adjacent vertex
8.		If (j ³ N) then
9.			Print “No vertex labelled X[i] does exist.Edge from V_x to X[i] is not established”
10.	Else // Set the edge from V_x to V_xi
11.		DGptr[V_x][j] = 1
12.	EndIf
13.	EndFor
	/* To set the edge from V_yi, l = 1, 2, ..., n to V_x*/
14.	For i = 1 to n do
15.		j = Y[i] // j is the label of i-th adjacent vertex of V_x
16.		If (j ³ N) then
17.			Print “No vertex labelled X[i] does not exist. Edge from Y[i] to V_x is not established”
18.		Else // Set the edge from V_yi to V_x
19.			DGptr[j][V_x] = 1
20.		EndIf
21.	EndFor
22.	Stop

One can easily notice the simplicity of these two algorithms in comparison to the same algorithms for the graph with linked list representation.

Inserting an edge is also very simple if the graph is represented by adjacency matrix. Suppose, we want to add an edge between v_i and v_j, the two existing vertices in the graph. In case of undirected graph, we must set both the entries at i-th row, j-th column as well as at j-th row and i-th column with 1’s. On the other hand, in case of a directed graph, for the edge- directed from v_i to v_j, we have to set the entries at i-th row, j-th column only. These operations are formulated as in the algorithms InsertEdge_AM_UG and InsertEdge_AM_DG as stated below:

Algorithm InserEdge_AM_UG

Input: <V_i, V_j>, the edge between the vertices V_i and V_j is to be inserted.

Output: The graph with edge added between V_i and V_j.

Data structure: Matrix representation of graph. UGptr, the pointer to the undirected graph.

Steps:

1.	Let n = number of the vertices in the graph.
2.	If (V_i > n) or (V_j > n) then
3.		Print “Edge is not possible between V_i and V_j”
4.	Else
5.		UGptr[V_i][V_j] = 1 // Set (i, j) entry
6.		UGptr[V_j][V_j] = 1 // Set (j, i) entry
7.	EndIf
8.	Stop

Algorithm InsertEdge_AM_DG

Input: DGptr, the pointer to the graph. <V_i, V_j>, the edge from V_i to V_j to be inserted.

Output: The graph with edge added between V_i and V_j.

Data structure: Matrix representation of graph. DGptr, the pointer to the directed graph.

Steps:

1.	Let n = number of the vertices in the graph.
2.	If (V_i > n) or (V_j > n) then
3.		Print “Edge is not possible between V_i and V_j”
4.	Else
5.		DGptr[V_i][V_j] = 1 // Set (i, j) entry
7.	EndIf
8.	Stop

Deletion

Next operations that we will discuss is the deletion of a vertex and deletion of an edge from a graph. First, let us consider the deletion of a vertex from a graph. This can be easily accomplished by removing the i-th row and i-th column for the vertex having label i. This is irrespective of whether it is directed or undirected graph. The algorithm DeleteVertex_AM is described as below:

Algorithm DeleteVertex_AM

Input: V_x, the vertex that has to be removed. Let N be the total number of vertices in the graph.

Output: The reduced graph without the vertex V_x and associated edge(s) with it.

Data structure: Matrix representation of graph. Gptr, the pointer to the graph.

Steps:

1.	If N = 0 then
2.		Print “Graph is empty: No deletion”
3.		Exit
4.	EndIf
5.	If (V_x > N) then
6.		Print “Vertex does not exist”
7.		Exit
8.	EndIf
	/ To remove the row and column of V_x vertex /
9.	Let j = V_x
10.	For i = j to N – 1 do
11.		For k = 1 to N do
12.			Gptr[k][i] = Gptr[k][i + 1] // Shift all the columns after V_x towards left once
13.		EndFor
14.	EndFor
15.	For i = j to N – 1 do
16.		For k = 1 to N do
17.			Gptr[i][k] = Gptr[i + 1][k] // Move all the rows after V_x towards up once
18.		EndFor
19.	EndFor
20.	N = N – 1 // The vertex V_x is deleted
21.	Stop

The deletion of an edge from a graph is also straightforward. To delete an edge <V_i, V_j> from the undirected graph, we must reset the entries at (i, j) and (j, i) both, in the adjacency matrix by 0’s. In case of directed graph, if there is the edge from V_i to V_j then we have to reset the entry at (i, j) of the matrix only. The algorithm DeleteEdgex_AM is described as below to delete an edge from any graph.

Algorithm DeleteEdge_AM

Input: <V_i, V_j>, the edge to be deleted between the vertices V_i, V_j.

Output: The graph without edge between V_i and V_j.

Data structure: Matrix representation of graph. Gptr, the pointer to graph.

Steps:

1.	Let n = number of vertices in the graph.
2.	If (V_i > n) or (V_j > n) then
3.		Print “Vertex does not exist: Error in edge removal”
4.	Else
5.		Case: Undirected graph
6.			Gptr[V_i][V_j] = 0
7.			Gptr[V_j][V_i] = 0
8.		Case: Directed graph
9.			Gptr[V_i][V_j] = 0
10.	EndIf
11.	Stop

Traversals

We know there are two graph traversal methods: DFS and BFS, and we have learnt how these two methods can be implemented on graphs which are represented using linked lists. In this section, we are to describe the realization of same methods when graphs are stored in adjacency matrices.

Let us first consider the DFS traversal on a graph which is represented with an adjacency matrix. Following is the algorithm DFS_AM for the purpose.

Algorithm DFS_AM

Input: V = the starting vertex. Let N be the number of vertices in the graph

Output: An array VISIT giving the order of visit of vertices during traversal.

Data structure: A stack OPEN to hold the vertices which is initially empty. Matrix representation of graph

with Gptr is the pointer to it.

Steps:

1.	If Gptr = NULL then
2.		Print “Graph is empty”
3.		Exit
4.	EndIf
5.	u = V
6.	OPEN.PUSH(u) // Push the starting vertex into OPEN
7.	While (OPEN.TOP ¹ NULL) do // Till the stack is not empty
8.		u = OPEN.POP( ) // POP the top element from OPEN
9.		If (SearchArray(VISIT, u) = FALSE) then // If v is not in the array VISIT
10.			InsertArray(VISIT, u) // Store v in VISIT
11.			For i = 1 to N do // Push all the adjacent vertex of v into OPEN
12.				If (Gptr[v][i] = 1) then
13.					OPEN.PUSH(i)
14.				EndIf
15.			EndFor
16.		EndIf
17.	EndWhile
18.	Return(VISIT)
19.	Stop

Now let us consider the BFS traversal on a graph represented with adjacency matrix. The algorithm as stated below is an attempt to implement BFS traversal.

Algorithm BFS_AM

Input: V, the starting vertex. Let N be the number of vertices in the graph.

Output: An array to store the order of visit of vertices during traversal.

Data structure: A queue OPENQ to hold the vertices which is initially empty. Matrix

representation of graph with Gptr is the pointer to it.

Steps:

1.	If (Gptr = NULL) then
2.		Print “Graph is empty”
3.		Exit
4.	EndIf
5.	u = V
6.	OPENQ.ENQUEUE(u) // Enter the starting vertex into the queue OPENQ
7.	While (OPENQ.STATUS( ) ¹ EMPTY) do // Till the OPENQ is not empty
8.		v = OPENQ.DEQUEUE( ) // Delete the item from OPENQ
9.		If (SearchArray(VISIT, u) = FALSE) then // If v is not in the array VISIT
10.			InsertArray(VISIT, u) // Store visited node u in VISIT
11.			For i = 1 to N do // To enter all the adjacency vertices of v into OPENQ
12.				If (Gptr[u][i] = 1) then
13.					OPENQ.ENQUEUE(i)
14.				EndIf
15.			EndFor
16.		EndIf
17.	EndWhile
18.	Return(VISIT)
19.	Stop

In the above stated two algorithms, we have used few operations on array structure, namely, INSERT and SEQ_SEARCH which are already discussed in Chapter 2 and ENQUEUE( ), DEQUEUE( ) in Chapter 5.

It can be noted that, in these algorithms we have used an array VISIT to store the traversal order of visited vertices and then path of traversal can easily be traced from it.

Merging of two graphs

We have already discussed the merging of two similar type of graphs when they are stored using linked list. Here also, the same case will be discussed but for the graphs stored as adjacency matrices. In the said section, we have studied for undirected graphs. Here, we will study about merging of directed graphs. Figure 11.24 illustrates the two graphs G1 and G2 and the resultant graph G after merging both the graphs. Related adjacency matrices are also displayed.

Fig. 11.24 Merging of two graphs in adjacency matrix representation.

If we carefully observe the adjacency matrix of the resultant graph G in Figure 11.23(b), we can find that the entire adjacency matrices of G1 and G2 are copied to G, as shown in the shaded parts. In addition to that, there are few new entries correspond to the new edges. So merging of two graphs can be accomplished by copying the adjacency matrices of the components and then setting of few entries for new edges. The method is formulated as the algorithm MERGE_DG_ AM.

Algorithm Merge_DG_AM

Input: DG1ptr, the pointer to the digraph G1. DG2ptr, the pointer to the digraph G2.

S = Set of directed edges.

Let N₁ and N₂ are the numbers of vertices in the graphs G1 and G2 respectively.

Output: DGptr, the pointer to the resultant graph G.

Steps:

1.	N = N₁+ N₂
	/ To initialize all the entries of the matrix of graph G /
2.	For i = 1 to N do
3.		For j = 1 to N do
4.			DGptr[i][j] = 0
5.		EndFor
6.	EndFor
	/ To copy the adjacency matrix of G1 in the adjacency matrix of G /
7.	For i = 1 to N₁ do
8.		For j = 1 to N₁ do
9.			DGptr[i][j] = DG1ptr[i][j]
10.		EndFor
11.	EndFor
	/ To copy the adjacency matrix of G2 in the adjacency matrix of G /
12.	For i = 1 to N₂ do
13.		For j = 1 to N₂ do
14.			DGptr[i + N₁] [j + N₁] = DG2ptr[i][j]
15.		EndFor
16.	EndFor
	/ To set the edges from G1 to G2 /
17.	doFlag = TRUE // Continue till there is an edge from G1 to G2
18.	While (doFlag) do
19.		Read (V, W) // Read an edge <V, W>: V in G1 and W in G2
20.		If (V < N₁) and (W < N₂) then // If they are valid vertices in G1 and G2
21.			DGptr[V][N₁ + W] = 1 // Add the edge <V, W>
22.		EndIf
23.		Read (doFlag) // Read the value for doFlag in order to continue or not
24.	EndWhile
	/ To set the edge from G2 to G1 /
25.	doFlag = TRUE // Continue till there is a edge from G2 to G1
26.	While (doFlag) do
27.		Read (W, V) // Read an edge <V, W>: V in G1 and W in G2
28.		If (V < N₁) and (W < N₂) then // If they are valid vertices in G1 and G2
29.			DGptr[N₁+ W][V] = 1 // Add the edge <W, V>
30.		EndIf
31.		Read (doFlag) // Read the value for doFlag in order to continue or not
32.	EndWhile
33.	Stop

11e can conclude that the steps involved in case of graph operations when the graphs are represented with adjacency matrices are simpler than that with the linked lists representation where overhead of list manipulation is involved. Note that the algorithms in both cases are designed in such a way that steps involved remain almost the same. During our subsequent discussions, we will use the adjacency matrix representation of graphs in most cases. (Reader can easily convert the use of adjacency representation to linked list representation or vice versa.)

Application of Graph Structures

Graph is an important data structure whose extensive applications are known in almost all application areas. Nowadays, many applications related with computation can be managed efficiently with graph structures. As for examples, consider two simple problems where graph structures can be utilized to solve the problems.

Transportation problem

This is a well-known problem in shipping goods. Suppose, there are several warehouses, which are distributed over geographically distant locations. It is required to transport goods from a given warehouse to another. The various paths possible among the warehouses are as shown in Figure 11.25. Assuming that the cost of transportation for all paths are known, then the problem is to find a path of transportation from a warehouse, say A, to another warehouse, say B, so that the cost of transportation is minimum. This problem can easily be represented using graph structure where each warehouse can be thought as vertex and edge as the path and weight of an edge is the cost of transportation involved. In this problem, therefore, if several paths are known from A to B, then it is required to find a path where the sum of the weights are minimum. How this problem, can be accommodated to a particular graph theory problem and hence can be solved, will be discussed shortly.

Fig. 11.25 Graph of transportation network.

Map colouring

Let us consider another interesting problem—colouring of map. We have to colour a map so that no two adjacent regions have the same colour. This problem can be presented with the help of a graph; each region can be represented as vertex, and if two regions are adjacent we can represent this by an edge between the two vertices which represent two regions. Figure 8.26 illustrates such a map and its representation with graph structures. Now the problem can be posed as to colour all the vertices so that no two adjacent vertices are of the same colour. In other words, whether a given graph can be coloured with this property or not, and if possible then with minimum how many colours we can do this. We will discuss all such kind of problems with the help of graph structure.

Fig. 11.26 Map and its corresponding graph.

It is not possible to discuss all the problems where graph structures are involved because this domain is very large; however, we limit our discussions with a few important of them which play significant roles in various applications.

In order to maintain the generality, we will discuss them as graph theoretic problems. Conversion of a particular problem into this general graph theoretic problem will rest on the reader. The major graph theoretic problems are listed as below:

1. Shortest path problem

2. Topological sorting of a graph

3. Spanning trees

4. Connectivity of a graph

5. Euler’s path and Hamiltonian p

6. Binary decision diagram.

We shall discuss all these problems separately in the following sub-sections.

Shortest Path Problem

This problem of a graph is about finding a path between two vertices in such a way that this path will satisfy some criteria of optimization. For example, for a non-weighted graph, the number of edges will be minimum and for weighted graph, the sum of weights on all edges in the path will be minimum. This problem has varieties of classical solutions. A few important algorithms are: Warshall’s algorithm, Floyd’s algorithm and Dijkstra’s algorithm.

Warshall’s algorithm

This is a classical algorithm by which we can determine whether there is a path from any vertex v_i to another vertex v_j either directly or through one or more intermediate vertices. In other words, we can test the reachability of all pairs of vertices in a graph. We have introduced the notion of path matrix for a graph. Also we have seen how the path matrix can be computed from the adjacency matrix, A, of a graph using Lemma 11.4. This computation involves the computation of A², A³, ..., Aⁿ and then B_n. The method is computationally not efficient at all. To compute the path matrix of a given graph another elegant method is Warshall’s algorithm. This algorithm treats the entries in the adjacency matrix as bit entries and perform AND (Ù), and OR (Ú)—the Boolean operations on them. The heart of the algorithm is a trio of loops, which operates very much like the loops in the classic algorithms for matrix multiplication.

Let us first state this algorithm and then we will elaborate how it works.

Algorithm Warshall

Input: A graph G whose pointer to its adjacency matrix is Gptr and vertices are labelled as 1, 2, ..., N; N

being the number of vertices in the graph.

Output: The path matrix P.

Data structure: Matrix representation of graph with pointer as Gptr.

Steps:

	/ Initialization of path matrix P with the adjacency matrix Gptr* /*
1.	For i = 1 to N do
2.		For j = 1 to N do
3.			P[i][j] = Gptr[i][j]
4.		EndFor
5.	EndFor
6.	For k = 1 to N do
7.		For i = 1 to N do
8.			For j = 1 to N do
9.				P[i][j] = P[i][j] Ú (P[i][k] Ù P[k][j])
10.			EndFor
11.		EndFor
12.	EndFor
13.	Return(P)
14.	Stop

The functionality of this algorithm can be understood very easily. In Step 1, we are to initialize the path matrix P with the adjacency matrix Gptr. This signifies that initially the path matrix maintains P[i][j] entries whether there is a direct path from v_i to v_j. In Step 3, the outermost loop will execute N times. For each k, it decides whether there is a path from v_i to v_j (for all i, j = 1, 2, ..., N) either directly or via k, i.e., from v_i to v_k and then from v_k to v_j. It therefore sets the P[i][j] entries to 0 or 1 accordingly.

As an illustration, let us consider a graph as shown in Figure 11.27(a). From the graph it is observed that there is a path from v₁ to v₄ via v₂. During the execution of step 3 in Warshall’s algorithm and when k = 2, we see that P[1][4] = P[1][4] Ú (P[1][2] Ù P[2][4]) = 0 Ú (1 Ù 1) = 1 thus enumerating the path from v₁ to v₄ via v₂. The final path matrix is obtained as shown in Figure 11.27(b). From this, it is seen that there is no path from the vertex v₄ to any other vertex (all row entries are zero). Also, we cannot reach to vertex v₁ from any other vertex in the graph (all column entries are zero). Other possible reachabilities can also be seen from this path matrix. Here, the vertex v₁ acts as a source vertex (one can reach to any vertex from it) and vertex v₄ as a sink vertex (that is, if we reach at this vertex then it cannot be moved to any other vertices). There is no cycle in the graph as all the diagonal entries in the path matrix are zero.

Fig. 11.26 Continued.

Fig. 8.27 Illustration of Warshall’s algorithm.

Floyd’s algorithm

It can be noted that the path matrix so obtained using Warshall’s algorithm shows the presence or absence of any path between a pair of vertices; it also detects the presence or absence of a cycle at any vertex. The Warshall’s algorithm does not take into account the weights of edges. If weights are to be taken into account and if we are interested in the length of the shortest path between any pair of vertices, then another classical solution is known, given by Robert Floyd and is called the Floyd’s algorithm. In fact, the basic structure of the Floyd’s algorithm is same as Warshall’s algorithm.

Let us first state the Floyd’s algorithm. To describe it, we will use two functions, namely, Min(x, y) and Combine(p₁, p₂). Min(x, y) returns the minimum value between x and y, and Combine(p₁, p₂) returns the concatenation of two paths p₁ and p₂ resulting a single path. For example, say p₁ = 1-2-3 and p₂ = 3-4, then Combine(p₁, p₂) will return p = 1-2-3-4. This algorithm produces two matrices: one stores the length of shortest paths between all pair of vertices and another stores the shortest paths between all pair of vertices, if exist. Following is the algorithm in detail.

Algorithm Floyd

Input: A graph G whose pointer to its weighted adjacency matrix is Gptr and vertices are labeled as 1, 2,

..., N; N being the number of vertices in the graph.

Output: Q is a matrix of order N ´ N containing the length of the shortest path between all pair of vertices. PATHS is a matrix of order N ´ N containing the string of the shortest length of paths between all pair of vertices.

Steps:

	/ To initialize the matrix Q and PATHS /
1.	For i = 1 to N do
2.		For j = 1 to N do
3.			If (Gptr[i][i] = 0) then // No path between v_i and v_j
4.				Q[i][j] = ¥ // Initialize with large number say, infinity
5.				PATHS[i][j] = NULL // Path is empty
6.			Else
7.				Q[i][j] = Gptr[i][j]
8.				P = Combine(i, j) // Initial path is i-j
9.				PATHS[i][j] = P // Store this path
10.			EndIf
11.		EndFor
12.	EndFor
	/ To compute all pairs of shortest paths /
13.	For k = 1 to N do
14.		For i = 1 to N do
15.			For j = 1 to N do
16.				Q[i][j] = Min(Q[i][j], Q[i][k] + Q[k][j])
17.				If (Q[i][k] + Q[k][j] < Q[i][j]) then // If alternate path is found
18.					p₁ = PATHS[i][k]
19.					p₂ = PATHS[k][j]
20.					PATHS[i][j] = Combine (p₁, p₂) // Store the altered path
21.				EndIf
22.			EndFor
23.		EndFor
24.	EndFor
25.	Return (Q, PATHS)
26.	Stop

One can easily notice the similarities of Warshall’s algorithm and Floyd’s algorithm. In the second algorithm, the statement in inner-most loop (that is, j-loop), namely, Q[i][j] = MIN(Q[i][j], Q[i][k] + Q[k][j]) is in fact, the modified version of P[i][j] = P[i][j] Ú (P[i][k] Ù P[k][j]), which appears in Warshall’s algorithm. In addition to this, there are some extra steps in Floyd’s algorithm. The Floyd’s algorithm has one underlying assumption that if there is no path from v_i to v_j then Gptr[i][j] = ¥. (Infinity is the largest possible positive value.) This is incorporated in our algorithm as stated above (see Step 1). Floyd’s algorithm not only computes the length of the shortest paths, it also enumerates the shortest paths between the vertices and keeps a track of it; this is accomplished as in inner-most j-loop in Step 15-22. It is better, if we illustrate the execution of this algorithm with an example. Let us consider a weighted digraph as shown in Figure 8.29 for the purpose.

The algorithm when traced, the cost matrix Q and PATHS as they appear are shown below. The new/altered entries are underlined. The cost matrix Q and its corresponding path matrix are shown side by side.

Fig. 11.29 A weighted graph and its adjacency matrix.

After initialization of Q and PATHS,

After k = 1

After k = 2,

After k = 3,

After k = 4,

After k = 5,

The outer-most loop (k-loop) in Step 13-24 is to explore the possibility of the shortest path through vertex k = 1, 2, ..., 5, if found, it updates the cost and alters the path otherwise retains previous value.

For the given graph, we found that: there exist paths (shortest) between every pair of vertices, and every vertex has their cycle. (Note: Floyd’s algorithm can also be used for non-weighted graphs. In that case, weight of edge should be taken as 11).

Dijkstra’s algorithm

Another kind of the shortest path problem is the single source shortest path problem. In this problem, there is a distinct vertex, called source vertex and it requires to find the shortest path from this source vertex to all other vertices. As an example, let us consider a simple graph as shown in Figure 11.30. The different shortest paths, assuming v₁ as the source vertex, are listed as below:

Shortest path Length of the shortest path

From v₁ to v₂ v₁–v₂1

From v₁ to v₃ v₁–v₂–v₃3

From v₁ to v₄ v₁–v₂–v₄4

From v₁ to v₅ v₁–v₂–v₃–v₅5

Fig. 11.30 A simple graph representing shortest path.

The algorithm to find such paths was first proposed by E. W. Dijkstra and popularly known as Dijkstra’s algorithm. In this section, we will describe this algorithm in a simplest way and then will talk about its potential applications.

Assume that all the vertices in the graph are labelled as 1, 2, ..., N and the graph is represented through an adjacency matrix. Dijkstra’s algorithm requires three arrays as below:

LENGTH[1 ... N] = array of distances

PATH[1 ... N] = array of vertices

SET[1 ... N] = array of Boolean tags

In conclusion, the shortest distance from the source vertex to any vertex say, i(i = 1, 2,
..., N) is stored in LENGTH[i], while PATH[i] contains the nearest predecessor(s) of the vertex i on the path determining the shortest distance from source vertex to the vertex i. In other words, the array PATH keeps a track of shortest path from any vertex to the source vertex. The Boolean array SET is used during the execution of the algorithm. SET[i] = 1 means the shortest distance, and path from the source vertex to the vertex i is already enumerated.

Suppose the source vertex be S, the algorithm consists of two major parts: an initialization part followed by an iteration part. In initialization part, we are to initialize the above mentioned three arrays. Once the initialization is over, we are to enter into iteration part, where the vertices will be included one by one in the set of vertices for which the shortest distance from S is enumerated. The detail steps of the algorithm is stated as below:

Algorithm Dijkstra(S)

Input: Gptr, the pointer to the graph. S, the source vertex. Let N be the number of vertices.

Output: LENGTH, an array of distances from S to all other vertices. PATH, an array of string of vertices

giving the track of all shortest paths.

Data structure: Matrix representation of graph with Gptr as the pointer to it.

Steps:

	/ INITIALIZATION /
1.	For i = 1 to N do
2.		SET[i] = 0 // Initialization of SET array
3.	EndFor
4.	For i = 1 to N do
5.		If Gptr[S][i] = 0 then // There is no direct path from S to vertex i
6.			LENGTH[i] = ¥
7.			PATH[i] = NULL//Empty path
8.		Else
9.			LENGTH[i] = Gptr[S][i]
10.			PATH[i] = S // Source vertex is the immediate predecessor to vertex i
11.		EndIf
12.	EndFor
13.	SET[S] = 1
14.	LENGTH[S] = 0 // Source vertex is implicitly enumerated with length as 0
	/ITERATION/
15.	complete = FALSE // It is a flag for controlling the iteration
16.	While (not complete) do
17.		j = SearchMin(LENGTH, SET) // Find a vertex j which has minimum distance among // those vertices yet not enumerated for the shortest path
18.		SET[j] = 1 // Vertex j is enumerated
19.		For i = 1 to N do // For each i not yet enumerated
20.			If SET[i] = 1 then // If i is enumerated already
21.				i = i + 1 // then go to next vertex
22.			Else // If i is connected to j by an edge
23.				If Gptr[i][j] ¹ 0 then
24.					If ((LENGTH[j] + GPTR[i][j]) < LENGTH[i]) then
25.						LENGTH[i] = LENGTH[j] + Gptr[i][j]
26.						PATH[i] = j // Vertex j becomes the immediate predecessor of i
27.					EndIf
28.				EndIf
29.			EndIf
30.		EndFor
	/ To test whether all vertices are enumerated or not /
31.		complete = TRUE
32.		For i = 1 to N do
33.			If SET[i] = 0 then
34.				complete = FALSE
35.				Break // Break the loop
36.			Else
37.				i = i + 1
38.			EndIf
39.		EndFor
40.	EndWhile
41.	Return (LENGTH, PATH)
42.	Stop

In this algorithm, we have assumed a procedure SearchMin (see in Step 17). This procedure will return the label of the vertex which has minimum distance (by consulting the array length) and which is not yet enumerated in the shortest path (by consulting the array SET). The actual code of this procedure is very simple and is left as an exercise.

To understand Dijkstra’s algorithm, it is better to trace it with a graph. Let us consider the graph as shown in Figure 11.31.

Fig. 11.31 A graph for Dijkstra’s algorithm.

The result of the application of Dijkstra’s algorithm on the graph (Figure 8.31) is illustrated as shown in Figure 11.32. The entries in the arrays LENGTH and PATH, during the execution of various steps of Dijkstra’s algorithms, are illustrated step-by-step. Following information can be concluded from these two arrays.

Path Length Shortest path

From (Source vertex) To

1 2 1 1–2

3 3 1–2–3

4 4 1–2–4

5 5 1–2–3–5

Fig. 8.32 Illustration of Dijkstra’s algorithm.

The shortest path from the array PATH can be obtained by backward movement. For example, for the vertex 5, its immediate predecessor is 3 (at PATH[5]), immediate predecessor of vertex 3 is 2 (at PATH[3]), immediate predecessor of vertex 2 is 1 (at PATH[2]), and vertex 1 is the source vertex. Thus, the shortest path for vertex 5 from the source vertex 1 is 1–2–3–5, and length of this shortest path is 5.

We have discussed about the single source shortest paths problem and its solution due to Dijkstra’s algorithm. We have illustrated the algorithm with a weighted undirected graph. The algorithm is fairly applicable to solve the problem even if the graph is directed weighted, directed non-weighted, and non-weighted undirected, that is irrespective of the type of the graph. In case of non-weighted graph, weight of each edge should be taken as 1.

Assignment 8.15
Show how Dijkstra’s algorithm works on each of the graph as shown in Figure 8.33. Source vertices are denoted by thick circles.


Fig. 8.33

Assignment 8.16

There is another way to solve the single source shortest path problem. This approach is very similar to the breadth first search (BFS); the modification that is required is to use a priority queue instead of ordinary queue as used in BFS method. The priority of a vertex in the queue will be determined by the shortest path length of its predecessor plus the arc length (weight) from the predecessor to the vertex.

Write an algorithm based on the above mentioned approach.

Assignment 8.17
Figure 8.34 gives a road map connecting various places in a city and the cost of the tickets. Find the most economical route from place X to Y.



Fig. 8.34 A road map.

Minimum Spanning Trees

The next problem in the graph theory we are going to consider is minimum spanning tree problem. The spanning tree of a graph G can be defined as a tree which includes all the vertices of G. In Section 8.4.1, during the discussion of graph traversal, we have seen that the DFS and BFS traversals on graph result two trees: DFS-spanning tree and BFS-spanning tree. In Figure 8.37, DFS-spanning and BFS-spanning trees for a given graph G are shown. The above two spanning trees are obtained by DFS and BFS traversals starting from vertex v₁on G, respectively. It can be noted that starting with different vertex different spanning trees are also possible with the use of these traversals.

Fig. 8.37 Two spanning trees.

In this example, we have used a simple, undirected and non-weighted graph. The concept can be extended in a similar way for other graphs also. It can be noted that if the graph is connected then it always has minimum spanning tree(s). Minimum spanning tree problem is related to the weighted graph, where we find a spanning tree so that sum of all the weights of all the edges in the tree is minimum.

There are several methods available for finding minimum spanning tree of a graph. Out of them, two methods are known to be very efficient: (a) Kruskal’s algorithm and (b) Prim’s algorithm.

Kruskal’s algorithm

To obtain a minimum spanning tree of a graph, the novel approach was devised by J.B. Kruskal and known as Kruskal’s algorithm. Let us assume an undirected weighted graph with n vertices, where initially the spanning tree is empty. This algorithm then can be informally stated as below:

Algorithm Kruskal // Informal description //

1. List all the edges of the graph G in the increasing order of weights.

2. Select the smallest edge from the list and add it into the spanning tree (initially it is empty) if the inclusion of this edge does not make a cycle.

3. If the selected edge with smallest weight forms a cycle, remove it from the list.

4. Repeat step 2-3 until the tree contains n – 1 edges or list is empty.

5. If the tree T contains less than n – 1 edges and the list is empty, no spanning tree is possible for the graph, else return the minimum spanning tree T.

Let us consider an example for the illustration of the above algorithm. Figure 8.37 illustrates the algorithm for a graph G. Lists of edges with their weights are maintained in the form of a table. From this list, we have to select (ü) or reject (û) depending on whether it forms a cycle or not. The length of the spanning tree as obtained in Figure 8.38 can be calculated as 16.

Fig. 8.38 The minimum spanning tree using Kruskal’s algorithm.

Now, let us describe the Kruskal’s algorithm in detail. For the implementation of the algorithm, let us assume the data structure EDGE to maintain the list of edges, their weights and whether a particular edge is included in the tree or not. In the structure as shown in Figure 8.39, first two fields are to represent <v_i, v_j>, an edge. The field WEIGHT is the weight of an edge <v_i, v_j>, and SELECT is a Boolean field which store TRUE or FALSE to indicate if this edge is included or not in the minimum spanning tree.

Fig. 8.39 Structure of EDGE

We will assume the input graph in adjacency matrix representation. The spanning tree that will be produced by Kruskal’s algorithm, in fact, is a sub-graph of the graph and hence can be stored as another adjacency matrix. In addition to these, we will assume a method SORT_EDGES( ), which can be applied to the list of edges to sort the edges in the increasing order of their weights. To test whether, inclusion of an edge into the tree forms a cycle or not we can apply the algorithm WARSHALL which is already discussed in Section 8.5.1. With this, the algorithm KRUSKAL is described as below:

Algorithm KruskalSpanningTree

Input: Gptr, the pointer to the graph in adjacency matrix form. Vertices of the graph are labelled as 1, 2, ...,

N, N being the total number of vertices.

Output: TREE, the pointer to the minimum spanning tree in adjacency matrix form.

Data structure: An array X[1 ... N(N – 1)/2] of EDGE to store all the edges in the graph.

Steps:

	/ Initialization of array X /
1.	k = 1 // For the starting position of array X
2.	For i = 1 to N – 1 do
3.		For j = i + 1 to N do
4.			If Gptr[i][j] > 0 then
5.						X[k]→V_i = i
6.						X[k] →V_j = j
7.						X[k] →WEIGHT = Gptr[i][j]
8.						X[k] →SELECT = FALSE
9.						k = k + 1
10.			EndIf
11.		EndFor // j-loop
12.	EndFor // i-loop
	/ To decide whether the graph has spanning tree or not /
13.	ne = k // Number of edges
14.	If (ne < N – 1) then
15.		Print “No spanning tree possible in the graph”
16.		Exit // Stop the execution
17.	EndIf
18.	X.SortEdges( ) // Apply the sorting on X
	/ Initialize the adjacency matrix for spanning tree /
19.	For i = 1 to N do
20.		For j = 1 to N do
21.					TREE[i][j] = 0 // Initially the tree is empty
22.		EndFor // j-loop
23.	EndFor // i-loop
	/ Check edge one-by-one to include it into the TREE /
24.	k = 1, l = 1 // Start from the first edge
25.	While (k < N) do
26.		temp = TREE // Duplicate the TREE
27.		i = X[l] →V_i, j = X[l] →.V_j
28.		temp[i][j] = l, temp[j][i] = l // Temporarily included it into the tree
	/ To test whether it forms a cycle or not /
29.		Warshall(temp) // Apply Warshall’s method to obtain path matrix of temp
30.		flag = FALSE
31.		For p = 1 to N do
32.				If (temp[p][p] = 1) then //If a diagonal entry is 1 then inclusion of this edge forms cycle
33.							flag = TRUE
34.							Break // Break the loop
35.				EndIf
36.		EndFor
37.		If (not flag) then // If there is no cycle
38.				TREE[i][j] = 1, TREE[j][i] = 1 // Include the edge into the tree
39.				k = k + 1 // Edge count of the TREE
40.				X[l] →SELECT = TRUE
41.		EndIf
42.		l = l + 1 // Go to the next edge in the edge list
43.	EndWhile
44.	Return(TREE)
45.	Stop

Notes: Related to this algorithm, following points may be noted:

1. The size of the array X is N(N – 1)/2, this is because, the maximum number of edges in a simple graph with N vertices is N(N – 1)/2.

2. The information of all edges can be obtained from the upper triangular part (or lower triangular part, except diagonal element of the adjacency matrix of the graph. This is obvious if the graph is undirected. This is why step 2 has the loop-variables accordingly.

3. In step 5, we can take decision whether the graph has spanning tree or not. For a spanning tree with N vertices, there must be N – 1 edges. If the number of edges in the graph is less than N – 1, then no spanning tree is possible. This is possible when graph is not connected.

4. Before the application of Warshall’s algorithm, TREE must be duplicated into a temporary matrix, otherwise original TREE will get lost.

Assignment 8.19

1. Modify the algorithm KruskalSpanningTree to take into account the following considerations:

(a) Calculate the sum of the weights of the resultant spanning tree.

(b) Obtain the minimum spanning tree of any directed graph.

2. We have considered the spanning tree of minimum length. In some application, it desires to obtain a longest spanning tree, that is, a spanning tree where the sum of the weights of the edges in the tree is maximum.

How can the algorithm KruskalSpanningTree be modified to accomplish this? Verify your suggestion with a suitable example.

3. The diameter of a tree is defined as the maximum distance between any two vertices. For given a connected and undirected graph, devise an algorithm for finding a spanning tree of minimum diameter. Verify the correctness of your algorithm with an example.

Prim’s algorithm

This approach does not require listing of all edges in increasing order of weights or checking at each step that whether a newly selected edge forms a circuit or not. According to Prim’s algorithm, minimum spanning tree grows in successive stages: at any stage in the algorithm, we can see that we have a set of vertices that have already been included in the tree, the rest of the vertices have not. The prim’s algorithm then finds a new vertex to add it to the tree by choosing the edge <V_i, V_j>, the smallest among all edges, where V_i is in the tree and V_j is yet to be included in the tree. The algorithm starts by selecting a vertex arbitrarily, and then in each stage, we add an edge (by adding an associated vertex) to the tree.

The Prim’s algorithm can easily be implemented using the adjacency matrix representation of a graph. Suppose, there is an N ´ N adjacency matrix for a given undirected weighted graph, and vertices are labelled as v₁, v₂, ..., v_N. Start from vertex v₁, say. Get its nearest neighbour, that is, the vertex which has the smallest entry in row of v₁ (if more than one smallest entries are there, arbitrarily select any one); let it be v_i. Now consider v₁ and v_i as one sub-graph. Next, obtain the closest neighbour to this sub-graph, that is a vertex other than v₁ and v_i that has the smallest entry among all entries in the rows of v₁ and v_i. Let this new vertex be v_j. Therefore, the tree now grows with vertex v₁, v_i and v_j as one sub-graph. Continue this process, until all N vertices have been connected by N – 1 edges.

Let us now illustrate the above method of finding minimum spanning tree. An undirected weighted graph and its weighted adjacency matrix is shown in Figure 8.40(a). We start with v₁ and pick the smallest entry in row 1 which is 2 in column 4; thus v₄ is the nearest neighbour to v₁, The closest neighbour of sub-graph v₁ and v₄ is v₂, that can be seen by examining all the entries in row 1 and 4. The four remaining edges selected following the above procedure are <v₂, v₅>, <v₄, v₆>, (v₆, v₃) and (v₆, v₇). See Figure 8.40(b); the total length of the tree can be calculated as 16.

Fig. 8.40 Continued.

Fig. 8.40 Illustration of Prim’s algorithm.

Let us discuss the implementation of the Prim’s algorithm in details.

An array SELECTED[1... N] of Boolean will be used in our algorithm and SELECTED[i] = TRUE means that i-th vertex is included in the tree. The output of the minimum spanning tree will be stored in an adjacency matrix of order N ´ N. In the adjacency matrix of the given graph, we assume the (i, j) entry as infinity (a large positive value) if there is no edge between the vertices i and j. The algorithm PRIM is described as below.

Algorithm PrimSpanningTree

Input: Gptr, the pointer to the graph, in the form of adjacency matrix. Vertex are labelled as 1, 2, .., N; N

being the number of vertices in the graph.

Output: TREE, the pointer to the minimum spanning tree represented with adjacency matrix of order

N ´ N.

Data structure: An array SELECTED [1... N] of Boolean values.

Steps:

	/ Initializations /
1.	For i = 1 to N do
2.		SELECTED[i] = FALSE // Initially none of the vertices are selected
3.	EndFor
4.	For i = 1 to N do
5.		For j = 1 to N do
6.			TREE[i][j] = 0
7.		EndFor // j-loop
8.	EndFor //i-loop
	/ Select and include /
9.	SELECTED[1] = TRUE, ne = 1 // Select the vertex 1 as the starting vertex and number of edges in the tree
	/ To find the nearest neighbour vertex of one of the selected vertices /
10.	While (ne < N) do
11.		min = ¥
12.		For i = 1 to N do
13.			If (SELECTED[i] = TRUE) then
14.				For j = 1 to N do
15.					If (SELECTED[j] = FALSE) then
16.						If (min > Gptr[i][j]) then
17.							min = Gptr[i][j]
18.							x = i, y = j // Location of the nearest neighbour
19.						EndIf
20.					EndIf
21.				EndFor // j-loop
22.			EndIf
23.		EndFor
24.		TREE[x][y] = 1 // Include the nearest neighbour into the tree
25.		SELECTED[y] = TRUE
26.		ne = ne + 1
27.	EndWhile
28.	Return (TREE)
29.	Stop

Therefore, it can be observed that the Prim’s algorithm has less overhead than Kruskal’s algorithm.

Assignment 8.20
1. In several applications, another constraint of minimum spanning tree is known, called degree constraint minimum spanning tree. In the minimum spanning trees resulting from the preceding algorithms, a vertex v_i do not have any constraint on neighbours, that is, its degree varies as 1 £ degree(v_i) £ N – 1. Modify the Prim’s algorithm to impose a constraint on the number of neighbours, maximum neighbours possible for any vertex is q, say. 2. In a region, there are six thermal power stations, and electrical lining that is possible among various power stations are shown in Figure 8.41. Costs of electrification (rupees in crore) involved appear as weights on the edges.



Fig. 8.41

Obtain the minimum possible connection among the thermal stations so that any two thermal stations can be linked up with minimum cost involved.

Euler’s and Hamiltonian Circuits

In the previous sections, we have seen few paths like shortest path, path of traversal from a given vertex to any other vertex, etc. The following are a few more important paths widely known in the application of graph.

(i) Euler’s path and circuit

(ii) Hamiltonian path and circuit

Euler’s path and circuit

Euler’s path in a graph is defined as a path that includes every edge exactly once. If an Euler’s path starts and ends at the same vertex then it is called an Euler’s cycle or Euler’s circuit. The graph which consists an Euler’s circuit is called an Euler’s graph. The problem whether a given connected graph has an Euler’s circuit or not was first solved by the famous Swiss mathematician Leonard Euler in 1736 which marked the beginning of graph theory. He posed the Konigsberg’s bridge problem (see Section 8.1, Figure 8.2(c)), found its solution, and proved a theorem which is stated here as Lemma 8.7.

Lemma 8.7

A given connected graph G is an Euler’s graph if and only if all the vertices of G are of even degree.

Proof: If we follow the Euler’s path as it passes through a vertex v, we find that it must enter via an edge and exit via another edge. Therefore, the path crossing the vertex v occurs two edges at a time, and if crosses more than once then total number of edge is even. For an Euler’s circuit, the end vertex is same as the start vertex. The circuit begins by leaving it along an edge and ends by entering to it along another edge. So here, too, the edges occur twice. Hence the lemma.

As a slight modification of Lemma 8.7, we have a corollary.

Corollary: If there is an Euler’s path in a graph that starts and ends at different vertices then the starting and ending vertices has odd degree, and the rest have even degree.

In Figure 8.47, few graphs are presented and applying the above theorem, we can easily determine that G1 has an Euler’s path and G2 has an Euler’s circuit. Tracing of Euler’s path in G1 and Euler’s circuit in G2 are left to the reader as problem.

Fig. 8.47 To test for Euler’s paths and circuits.

It is more convenient to consider Euler’s circuits in a graph than Euler’s paths. As we see, the Euler’s path and Euler’s circuit problems, though slightly different, have the same basic theory for their solution. Thus, we will consider the Euler’s circuit problem as a general solution.

In the next section, we will discuss how to obtain an Euler’s circuit of a graph. We will assume that the given graph is an Euler graph. The method initially picks a vertex, say v_i randomly. It then follows repeated application of the following three steps till all the edges are not covered in the Euler’s circuit.

Step 1: Perform depth first search starting with v_i to obtain a circuit at v_i. Let this partial Euler’s circuit (PEC) at v_i be

a = [v_i v_j. . . v_k v_l v_m. . . v_x v_i]

Step 2: Scan the PEC a to find the first vertex which has not traversed edge (it means it is already not covered by any previous PEC). Let this vertex be v_l. Do step 1 with v_l. Let the PEC at v_l be

b = [v_l v_p. . . v_y v_l]

Step 3: Splice the PEC b at v_l of a. Thus, the elongated PEC a becomes

a = [v_i v_j. . . v_k v_l v_p. . . v_y v_l v_m. . . v_x v_i]

Go to step 2, if there is a vertex which still has not traversed edge in it.

To understand more clearly, the above described method is illustrated with an example as in Figure 8.48. Initially, the vertex 1 is selected and DFS traversal starting at 1 results the PEC(1) = [1-2-3-1]. Edges which are visited are marked as dashed. The vertex 2 is the first vertex in PEC(1) which has not traversed edge, so next DFS starts at vertex 2 and PEC(2) results the PEC(1) = [1-2-5-4-2-3-1]. In these PEC(1), the first vertex having not traversed edge is 5. So, DFS traversal at 5 results PEC(5) = [5-7-6-5]. Therefore, we obtain, PEC(1) = [1-2-5-7-5-4-2-3-1]. The vertex 6 is then found as first vertex in PEC(1) which has still not traversed edge, and PEC(6) = [6-4-3-6]. Splicing this with PEC(1), we get PEC(1) = [1-2-5-7-6-4-3-6-5-4-2-3-1]. There is no more vertex in this latest PEC(1) which has not traversed edge. The procedure is terminated successfully giving the Euler’s circuit as shown in Figure 8.48.

Detail of the algorithm for the above mentioned method is left as an exercise to the reader.

Assignment 8.23

1. Trace the Euler’s path and Euler’s circuit in the graph G1 and G2, respectively as given in Figure 8.47 using the method described.

2. We have discussed the Euler’s circuit in an undirected graph. In the same way, it can be applied to directed graphs also. Verify that:

(a) A directed graph has Euler’s circuit if and only if it is strongly connected.

(b) Every vertex has equal indegree and outdegree.

Hamiltonian path and circuit in a graph

In contrast to Euler’s paths, where all edges are traversed exactly once, a Hamiltonian path is one in which all vertices are traversed exactly once. Despite the similarity of definitions the theory of Hamiltonian paths is significantly more intricate than the theory of Euler’s path.

Similarly, a Hamiltonian circuit in a graph G is a circuit in which each vertex of G appears exactly once except for the starting vertex (it is also ending vertex), which appears just twice.

A graph is a Hamiltonian graph if it has a Hamiltonian circuit.

In Figure 8.49, Hamiltonian path and Hamiltonian circuit in graph G1 and G2 is shown. Obviously, not every connected graph has a Hamiltonian circuit. For example, the graph G3 (as shown in the figure) is not a Hamiltonian graph.

Fig. 8.48 Tracing of Euler’s circuit in an Euler’s graph.

Fig. 8.49 Graphs to test Hamiltonian paths and circuits.

The question, therefore that arises: How to decide whether a given graph has a Hamiltonian circuit or not? This problem first posed by the famous Irish mathematician Sir William Rowan Hamilton in 1859, which is still unsolved. However, there are a few observations which give us idea to decide whether a graph has a Hamiltonian circuit or not. These are mentioned below.

1. A complete graphs of three or more vertices have Hamiltonian circuit (necessary and sufficient condition).

2. In a connected graph with n vertices, for any pair of vertices v_i, v_j that are not connected by an edge, and if degree(v_i) + degree(v_j) ³ n, where degree(v) denotes the degree of a vertex v, then there is a Hamiltonian cycle on the graph (this is known as Ore’s theorem and gives a sufficient but not necessary condition).

3. In a simple connected graph G, if v_i be any vertex in G and degree(v_i) ³ n/2, where degree(v_i) denotes the degree of vertex v_i and n is the number of vertices, then the graph G has a Hamiltonian circuit. This theorem is due to G. A. Dirac, and also gives sufficient but not necessary condition.

(Proofs of the above mentioned conditions are beyond the scope of this book).

There is no formal method known till time to trace the Hamiltonian circuit, if it exists, in a given graph. The problem in fact is known as NP complete problem. Theoretically, however, the problem can be solved using ‘brainless’ method: taking all possible permutations of all the vertices in the graph and then trace the graph according to a sequence of vertices in a permutation and test whether it satisfies the criteria of Hamiltonian circuit or not.

Assignment 8.24

A problem closely related to the question of Hamiltonian circuit problem is the traveling salesman problem. The problem is stated as below.

A salesman is required to visit a number of cities during a trip. Given the distances between the cities, in what order should he trave1 so as to visit every city precisely once and return to the starting city, with the minimum mileage traveled?

The traveling salesman problem can be solved using a method called the greedy method, which is stated as below:

1. Start from any city.

2. From a city currently being visited move to the nearest neighbour city (that is, the city with minimum distance).

3. Go to step 2, if all the cities are not visited.

4. Return to the starting city.

A graph as shown in Figure 8.50 represents a the cities and their connectivity for the traveling salesman problem. Determine a tour using the greedy method.

(a) Find a solution of the problem using the greedy approach as stated above.

(b) Is the solution so obtained guaranteed to be the minimum?

Fig. 8.50 Traveling salesman problem.